Overview

Dataset statistics

Number of variables: 12
Number of observations: 800
Missing cells: 387
Missing cells (%): 4.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 69.7 KiB
Average record size in memory: 89.2 B
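
The derived figures in this table follow from the raw counts. The sketch below recomputes them from the values reported above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 12
n_observations = 800
missing_cells = 387
total_size_kib = 69.7

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes; the average is taken per observation (row).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```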

Variable types

Numeric: 8
Categorical: 3
Boolean: 1

Warnings

Name has a high cardinality: 799 distinct values (High cardinality)
# is highly correlated with Generation (High correlation)
Generation is highly correlated with # (High correlation)
Type 2 has 386 (48.2%) missing values (Missing)
# is uniformly distributed (Uniform)
Name is uniformly distributed (Uniform)
# has unique values (Unique)

Reproduction

Analysis started: 2021-04-08 06:50:41.657851
Analysis finished: 2021-04-08 06:50:58.810943
Duration: 17.15 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

#
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 800
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 400.5
Minimum: 1
Maximum: 800
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 40.95
Q1: 200.75
Median: 400.5
Q3: 600.25
95th percentile: 760.05
Maximum: 800
Range: 799
Interquartile range (IQR): 399.5
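
The quantiles above can be reproduced by hand. The sketch below assumes that the # column holds the integers 1 to 800 (consistent with the minimum, maximum, and distinct count reported) and that percentiles are computed with linear interpolation between neighbouring order statistics; `percentile` is an illustrative helper, not a function of the profiling library.

```python
def percentile(values, q):
    """Return the q-th percentile (0 to 100) using linear interpolation."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100      # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + (data[upper] - data[lower]) * frac


ids = list(range(1, 801))  # assumed contents of the # column

q1, q3 = percentile(ids, 25), percentile(ids, 75)
print("5th percentile:", round(percentile(ids, 5), 2))
print("Q1:", q1, "Median:", percentile(ids, 50), "Q3:", q3)
print("IQR:", q3 - q1)
```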

Descriptive statistics

Standard deviation: 231.0844002
Coefficient of variation (CV): 0.5769897632
Kurtosis: -1.2
Mean: 400.5
Median Absolute Deviation (MAD): 200
Skewness: 0
Sum: 320400
Variance: 53400
Monotonicity: Strictly increasing
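
Most of the descriptive statistics for # can be checked with the Python standard library. The sketch again assumes the column holds the integers 1 to 800. The reported variance of 53400 matches the sample variance (n - 1 denominator), which is what `statistics.variance` computes. The `monotonicity` helper and its "Decreasing" labels are illustrative assumptions; only "Strictly increasing", "Increasing", and "Not monotonic" appear in this report.

```python
import statistics

ids = list(range(1, 801))  # assumed contents of the # column

mean = statistics.mean(ids)
variance = statistics.variance(ids)   # sample variance, n - 1 denominator
std = statistics.stdev(ids)
cv = std / mean                       # coefficient of variation
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)


def monotonicity(values):
    """Classify a sequence by comparing each element with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"


print(mean, variance, round(std, 7), round(cv, 10), mad, sum(ids))
print(monotonicity(ids))
```

A column such as Generation, where consecutive rows often repeat the same value but never decrease, would be classified as "Increasing" rather than "Strictly increasing".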
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
800                 1       0.1%
263                 1       0.1%
273                 1       0.1%
272                 1       0.1%
271                 1       0.1%
270                 1       0.1%
269                 1       0.1%
268                 1       0.1%
267                 1       0.1%
266                 1       0.1%
Other values (790)  790     98.8%

Value               Count   Frequency (%)
1                   1       0.1%
2                   1       0.1%
3                   1       0.1%
4                   1       0.1%
5                   1       0.1%

Value               Count   Frequency (%)
800                 1       0.1%
799                 1       0.1%
798                 1       0.1%
797                 1       0.1%
796                 1       0.1%

Name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 799
Distinct (%): 100.0%
Missing: 1
Missing (%): 0.1%
Memory size: 6.4 KiB

Gothitelle: 1
Flygon: 1
Mega Absol: 1
Machop: 1
Gabite: 1
Other values (794): 794

Length

Max length: 25
Median length: 8
Mean length: 8.375469337
Min length: 3

Characters and Unicode

Total characters: 6692
Distinct characters: 60
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
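
As a minimal sketch of how such character properties can be read, the standard-library `unicodedata` module exposes the general category of each code point. The three names are taken from the sample rows of this dataset; the two-letter codes map to the category names used in the report (`Lu` = Uppercase Letter, `Ll` = Lowercase Letter, `Zs` = Space Separator).

```python
import unicodedata
from collections import Counter

# A few names from the dataset's sample rows.
names = ["Bulbasaur", "Mega Venusaur", "Hoopa Confined"]

# Count the Unicode general category of every character in every name.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

total_characters = sum(len(name) for name in names)
for category, count in category_counts.most_common():
    print(category, count, f"{100 * count / total_characters:.1f}%")
```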

Unique

Unique: 799
Unique (%): 100.0%

Sample

1st row: Bulbasaur
2nd row: Ivysaur
3rd row: Venusaur
4th row: Mega Venusaur
5th row: Charmander
Value               Count   Frequency (%)
Gothitelle          1       0.1%
Flygon              1       0.1%
Mega Absol          1       0.1%
Machop              1       0.1%
Gabite              1       0.1%
Zapdos              1       0.1%
Talonflame          1       0.1%
Chikorita           1       0.1%
Ledian              1       0.1%
Lillipup            1       0.1%
Other values (789)  789     98.6%
Histogram of lengths of the category
Value               Count   Frequency (%)
mega                48      5.1%
forme               21      2.2%
size                8       0.9%
rotom               6       0.6%
kyurem              5       0.5%
pumpkaboo           4       0.4%
gourgeist           4       0.4%
therian             3       0.3%
deoxys              3       0.3%
wormadam            3       0.3%
Other values (756)  830     88.8%

Most occurring characters

Value               Count   Frequency (%)
a                   632     9.4%
e                   609     9.1%
o                   528     7.9%
r                   478     7.1%
i                   444     6.6%
n                   360     5.4%
l                   354     5.3%
t                   297     4.4%
u                   239     3.6%
s                   205     3.1%
Other values (50)   2546    38.0%

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter    5611    83.8%
Uppercase Letter    937     14.0%
Space Separator     136     2.0%
Other Punctuation   3       < 0.1%
Other Symbol        2       < 0.1%
Dash Punctuation    2       < 0.1%
Decimal Number      1       < 0.1%

Most frequent character per category

ValueCountFrequency (%)
a632
11.3%
e609
10.9%
o528
 
9.4%
r478
 
8.5%
i444
 
7.9%
n360
 
6.4%
l354
 
6.3%
t297
 
5.3%
u239
 
4.3%
s205
 
3.7%
Other values (17)1465
26.1%
ValueCountFrequency (%)
S129
13.8%
M120
12.8%
C62
 
6.6%
G58
 
6.2%
P56
 
6.0%
F50
 
5.3%
A47
 
5.0%
D47
 
5.0%
B45
 
4.8%
T44
 
4.7%
Other values (16)279
29.8%
ValueCountFrequency (%)
1
50.0%
1
50.0%
ValueCountFrequency (%)
.2
66.7%
'1
33.3%
ValueCountFrequency (%)
136
100.0%
ValueCountFrequency (%)
21
100.0%
ValueCountFrequency (%)
-2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin6548
97.8%
Common144
 
2.2%

Most frequent character per script

ValueCountFrequency (%)
a632
 
9.7%
e609
 
9.3%
o528
 
8.1%
r478
 
7.3%
i444
 
6.8%
n360
 
5.5%
l354
 
5.4%
t297
 
4.5%
u239
 
3.6%
s205
 
3.1%
Other values (43)2402
36.7%
ValueCountFrequency (%)
136
94.4%
.2
 
1.4%
-2
 
1.4%
1
 
0.7%
1
 
0.7%
'1
 
0.7%
21
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII6688
99.9%
Misc Symbols2
 
< 0.1%
None2
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
a632
 
9.4%
e609
 
9.1%
o528
 
7.9%
r478
 
7.1%
i444
 
6.6%
n360
 
5.4%
l354
 
5.3%
t297
 
4.4%
u239
 
3.6%
s205
 
3.1%
Other values (47)2542
38.0%
ValueCountFrequency (%)
1
50.0%
1
50.0%
ValueCountFrequency (%)
é2
100.0%

Type 1
Categorical

Distinct: 18
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 KiB

Water: 112
Normal: 98
Grass: 70
Bug: 69
Psychic: 57
Other values (13): 394

Length

Max length: 8
Median length: 5
Mean length: 5.26
Min length: 3

Characters and Unicode

Total characters: 4208
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Grass
2nd row: Grass
3rd row: Grass
4th row: Grass
5th row: Fire
Value               Count   Frequency (%)
Water               112     14.0%
Normal              98      12.2%
Grass               70      8.8%
Bug                 69      8.6%
Psychic             57      7.1%
Fire                52      6.5%
Electric            44      5.5%
Rock                44      5.5%
Ghost               32      4.0%
Dragon              32      4.0%
Other values (8)    190     23.8%
Histogram of lengths of the category
ValueCountFrequency (%)
water112
14.0%
normal98
12.2%
grass70
 
8.8%
bug69
 
8.6%
psychic57
 
7.1%
fire52
 
6.5%
rock44
 
5.5%
electric44
 
5.5%
ghost32
 
4.0%
dragon32
 
4.0%
Other values (8)190
23.8%

Most occurring characters

ValueCountFrequency (%)
r488
 
11.6%
a360
 
8.6%
o294
 
7.0%
e286
 
6.8%
c270
 
6.4%
s257
 
6.1%
i256
 
6.1%
t242
 
5.8%
l173
 
4.1%
g159
 
3.8%
Other values (18)1423
33.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3408
81.0%
Uppercase Letter800
 
19.0%

Most frequent character per category

ValueCountFrequency (%)
r488
14.3%
a360
10.6%
o294
8.6%
e286
8.4%
c270
7.9%
s257
7.5%
i256
7.5%
t242
 
7.1%
l173
 
5.1%
g159
 
4.7%
Other values (7)623
18.3%
ValueCountFrequency (%)
G134
16.8%
W112
14.0%
F100
12.5%
N98
12.2%
P85
10.6%
B69
8.6%
D63
7.9%
E44
 
5.5%
R44
 
5.5%
S27
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Latin4208
100.0%

Most frequent character per script

ValueCountFrequency (%)
r488
 
11.6%
a360
 
8.6%
o294
 
7.0%
e286
 
6.8%
c270
 
6.4%
s257
 
6.1%
i256
 
6.1%
t242
 
5.8%
l173
 
4.1%
g159
 
3.8%
Other values (18)1423
33.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII4208
100.0%

Most frequent character per block

ValueCountFrequency (%)
r488
 
11.6%
a360
 
8.6%
o294
 
7.0%
e286
 
6.8%
c270
 
6.4%
s257
 
6.1%
i256
 
6.1%
t242
 
5.8%
l173
 
4.1%
g159
 
3.8%
Other values (18)1423
33.8%

Type 2
Categorical

MISSING

Distinct: 18
Distinct (%): 4.3%
Missing: 386
Missing (%): 48.2%
Memory size: 6.4 KiB

Flying: 97
Ground: 35
Poison: 34
Psychic: 33
Fighting: 26
Other values (13): 189

Length

Max length: 8
Median length: 6
Mean length: 5.652173913
Min length: 3

Characters and Unicode

Total characters: 2340
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Poison
2nd row: Poison
3rd row: Poison
4th row: Poison
5th row: Flying
Value               Count   Frequency (%)
Flying              97      12.1%
Ground              35      4.4%
Poison              34      4.2%
Psychic             33      4.1%
Fighting            26      3.2%
Grass               25      3.1%
Fairy               23      2.9%
Steel               22      2.8%
Dark                20      2.5%
Dragon              18      2.2%
Other values (8)    81      10.1%
(Missing)           386     48.2%
Histogram of lengths of the category
ValueCountFrequency (%)
flying97
23.4%
ground35
 
8.5%
poison34
 
8.2%
psychic33
 
8.0%
fighting26
 
6.3%
grass25
 
6.0%
fairy23
 
5.6%
steel22
 
5.3%
dark20
 
4.8%
dragon18
 
4.3%
Other values (8)81
19.6%

Most occurring characters

ValueCountFrequency (%)
i257
 
11.0%
n210
 
9.0%
g170
 
7.3%
F158
 
6.8%
r157
 
6.7%
o153
 
6.5%
y153
 
6.5%
s131
 
5.6%
l129
 
5.5%
c106
 
4.5%
Other values (18)716
30.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1926
82.3%
Uppercase Letter414
 
17.7%

Most frequent character per category

ValueCountFrequency (%)
i257
13.3%
n210
10.9%
g170
8.8%
r157
8.2%
o153
7.9%
y153
7.9%
s131
 
6.8%
l129
 
6.7%
c106
 
5.5%
a104
 
5.4%
Other values (7)356
18.5%
ValueCountFrequency (%)
F158
38.2%
G74
17.9%
P67
16.2%
D38
 
9.2%
S22
 
5.3%
I14
 
3.4%
R14
 
3.4%
W14
 
3.4%
E6
 
1.4%
N4
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2340
100.0%

Most frequent character per script

ValueCountFrequency (%)
i257
 
11.0%
n210
 
9.0%
g170
 
7.3%
F158
 
6.8%
r157
 
6.7%
o153
 
6.5%
y153
 
6.5%
s131
 
5.6%
l129
 
5.5%
c106
 
4.5%
Other values (18)716
30.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII2340
100.0%

Most frequent character per block

ValueCountFrequency (%)
i257
 
11.0%
n210
 
9.0%
g170
 
7.3%
F158
 
6.8%
r157
 
6.7%
o153
 
6.5%
y153
 
6.5%
s131
 
5.6%
l129
 
5.5%
c106
 
4.5%
Other values (18)716
30.6%

HP
Real number (ℝ≥0)

Distinct: 94
Distinct (%): 11.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.25875
Minimum: 1
Maximum: 255
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 35.95
Q1: 50
Median: 65
Q3: 80
95th percentile: 110
Maximum: 255
Range: 254
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 25.53466903
Coefficient of variation (CV): 0.368685098
Kurtosis: 7.232078374
Mean: 69.25875
Median Absolute Deviation (MAD): 15
Skewness: 1.568224376
Sum: 55407
Variance: 652.0193226
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
60                  67      8.4%
50                  63      7.9%
70                  57      7.1%
65                  46      5.8%
75                  43      5.4%
80                  43      5.4%
40                  38      4.8%
45                  38      4.8%
55                  37      4.6%
100                 32      4.0%
Other values (84)   336     42.0%

Value               Count   Frequency (%)
1                   1       0.1%
10                  1       0.1%
20                  6       0.8%
25                  2       0.2%
28                  1       0.1%

Value               Count   Frequency (%)
255                 1       0.1%
250                 1       0.1%
190                 1       0.1%
170                 1       0.1%
165                 1       0.1%

Attack
Real number (ℝ≥0)

Distinct: 111
Distinct (%): 13.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.00125
Minimum: 5
Maximum: 190
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 5
5th percentile: 30
Q1: 55
Median: 75
Q3: 100
95th percentile: 136.2
Maximum: 190
Range: 185
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 32.45736587
Coefficient of variation (CV): 0.4108462318
Kurtosis: 0.1697173149
Mean: 79.00125
Median Absolute Deviation (MAD): 20
Skewness: 0.551613748
Sum: 63201
Variance: 1053.480599
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
100                 40      5.0%
65                  39      4.9%
80                  37      4.6%
50                  37      4.6%
85                  33      4.1%
60                  33      4.1%
75                  32      4.0%
70                  31      3.9%
90                  30      3.8%
55                  30      3.8%
Other values (101)  458     57.2%

Value               Count   Frequency (%)
5                   2       0.2%
10                  3       0.4%
15                  1       0.1%
20                  8       1.0%
22                  1       0.1%

Value               Count   Frequency (%)
190                 1       0.1%
185                 1       0.1%
180                 3       0.4%
170                 2       0.2%
165                 3       0.4%

Defense
Real number (ℝ≥0)

Distinct: 103
Distinct (%): 12.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.8425
Minimum: 5
Maximum: 230
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 5
5th percentile: 35
Q1: 50
Median: 70
Q3: 90
95th percentile: 130
Maximum: 230
Range: 225
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 31.18350056
Coefficient of variation (CV): 0.422297465
Kurtosis: 2.72626036
Mean: 73.8425
Median Absolute Deviation (MAD): 20
Skewness: 1.155912303
Sum: 59074
Variance: 972.4107071
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
70                  54      6.8%
50                  49      6.1%
60                  46      5.8%
80                  39      4.9%
40                  36      4.5%
65                  36      4.5%
90                  35      4.4%
100                 33      4.1%
55                  32      4.0%
45                  32      4.0%
Other values (93)   408     51.0%

Value               Count   Frequency (%)
5                   2       0.2%
10                  1       0.1%
15                  4       0.5%
20                  4       0.5%
23                  1       0.1%

Value               Count   Frequency (%)
230                 3       0.4%
200                 2       0.2%
184                 1       0.1%
180                 3       0.4%
168                 1       0.1%

Sp. Atk
Real number (ℝ≥0)

Distinct: 105
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.82
Minimum: 10
Maximum: 194
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 10
5th percentile: 30
Q1: 49.75
Median: 65
Q3: 95
95th percentile: 131.05
Maximum: 194
Range: 184
Interquartile range (IQR): 45.25

Descriptive statistics

Standard deviation: 32.72229417
Coefficient of variation (CV): 0.4493586126
Kurtosis: 0.2978936607
Mean: 72.82
Median Absolute Deviation (MAD): 20
Skewness: 0.7446624978
Sum: 58256
Variance: 1070.748536
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
60                  51      6.4%
40                  49      6.1%
65                  44      5.5%
50                  39      4.9%
55                  35      4.4%
45                  33      4.1%
70                  30      3.8%
35                  29      3.6%
85                  27      3.4%
80                  27      3.4%
Other values (95)   436     54.5%

Value               Count   Frequency (%)
10                  3       0.4%
15                  4       0.5%
20                  8       1.0%
23                  1       0.1%
24                  2       0.2%

Value               Count   Frequency (%)
194                 1       0.1%
180                 3       0.4%
175                 1       0.1%
170                 3       0.4%
165                 2       0.2%

Sp. Def
Real number (ℝ≥0)

Distinct: 92
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.9025
Minimum: 20
Maximum: 230
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 20
5th percentile: 32.95
Q1: 50
Median: 70
Q3: 90
95th percentile: 120
Maximum: 230
Range: 210
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 27.8289158
Coefficient of variation (CV): 0.3870368318
Kurtosis: 1.628394057
Mean: 71.9025
Median Absolute Deviation (MAD): 20
Skewness: 0.8540186115
Sum: 57522
Variance: 774.4485544
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
80                  52      6.5%
50                  50      6.2%
55                  47      5.9%
65                  44      5.5%
60                  43      5.4%
75                  40      5.0%
70                  40      5.0%
90                  36      4.5%
45                  35      4.4%
85                  30      3.8%
Other values (82)   383     47.9%

Value               Count   Frequency (%)
20                  6       0.8%
23                  1       0.1%
25                  11      1.4%
30                  20      2.5%
31                  1       0.1%

Value               Count   Frequency (%)
230                 1       0.1%
200                 1       0.1%
160                 2       0.2%
154                 3       0.4%
150                 7       0.9%

Speed
Real number (ℝ≥0)

Distinct: 108
Distinct (%): 13.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.2775
Minimum: 5
Maximum: 180
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 5
5th percentile: 25
Q1: 45
Median: 65
Q3: 90
95th percentile: 115
Maximum: 180
Range: 175
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 29.06047372
Coefficient of variation (CV): 0.4256229903
Kurtosis: -0.2364366728
Mean: 68.2775
Median Absolute Deviation (MAD): 21
Skewness: 0.3579332951
Sum: 54622
Variance: 844.5111327
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
50                  46      5.8%
60                  44      5.5%
70                  37      4.6%
65                  36      4.5%
30                  35      4.4%
80                  33      4.1%
40                  32      4.0%
90                  31      3.9%
100                 31      3.9%
55                  30      3.8%
Other values (98)   445     55.6%

Value               Count   Frequency (%)
5                   2       0.2%
10                  3       0.4%
15                  9       1.1%
20                  15      1.9%
22                  1       0.1%

Value               Count   Frequency (%)
180                 1       0.1%
160                 1       0.1%
150                 4       0.5%
145                 3       0.4%
140                 2       0.2%

Generation
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.32375
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 5
95th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.6612904
Coefficient of variation (CV): 0.4998241145
Kurtosis: -1.239575758
Mean: 3.32375
Median Absolute Deviation (MAD): 2
Skewness: 0.01425810028
Sum: 2659
Variance: 2.759885795
Monotonicity: Increasing
Histogram with fixed size bins (bins=6)
Value               Count   Frequency (%)
1                   166     20.8%
5                   165     20.6%
3                   160     20.0%
4                   121     15.1%
2                   106     13.2%
6                   82      10.2%

Value               Count   Frequency (%)
1                   166     20.8%
2                   106     13.2%
3                   160     20.0%
4                   121     15.1%
5                   165     20.6%

Value               Count   Frequency (%)
6                   82      10.2%
5                   165     20.6%
4                   121     15.1%
3                   160     20.0%
2                   106     13.2%

Legendary
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 928.0 B

False: 735
True: 65

Value               Count   Frequency (%)
False               735     91.9%
True                65      8.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the slope of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
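
A minimal standard-library sketch of this calculation follows. The function name `pearson_r` is illustrative; the HP and Attack values are the first five rows of the sample shown later in this report. The common 1/n (or 1/(n - 1)) factors of the covariance and the standard deviations cancel, so they are left out.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


hp = [45, 60, 80, 80, 39]       # HP, first five sample rows
attack = [49, 62, 82, 100, 52]  # Attack, first five sample rows

r = pearson_r(hp, attack)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(hp, [2 * a + 5 for a in attack])
print(r, r_rescaled)
```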

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
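
The sketch below follows that recipe: it replaces each value by its rank and then applies the Pearson formula to the ranks. Giving tied values the average of their positions is an assumption about tie handling, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# y = x squared is monotonic but not linear.
xs = [1, 2, 3, 4]
ys = [1, 4, 9, 16]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear pair, ρ reaches its maximum while r stays below it.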

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
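
A direct translation of that description is sketched below. It implements the simple variant usually called tau-a, in which tied pairs count as neither concordant nor discordant; statistical libraries often default to the tie-adjusted tau-b, so results can differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```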

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the phik project's documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
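
As a hedged sketch of a bias-corrected Cramér's V, the function below applies the correction as it is commonly stated for Bergsma's estimator: the chi-squared statistic per observation is reduced by (k - 1)(r - 1)/(n - 1), and the numbers of rows r and columns k are shrunk in the same spirit. It is an illustration under that assumption, not the profiling library's own code, and the function name is hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(row, col)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2_corr = max(0.0, chi2 / n - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfect association: each label in xs always pairs with the same label in ys.
perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["x"] * 50 + ["y"] * 50)
# Independence: every combination occurs equally often.
independent = cramers_v_corrected(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```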

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
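
The per-column nullity count behind the first of these plots can be sketched in a few lines. The rows below are a subset of the sample rows from this report, with `None` standing in for the NaN entries of Type 2; `missing_by_column` is an illustrative helper.

```python
def missing_by_column(rows):
    """Count None entries per column across a list of row dictionaries."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


rows = [
    {"Name": "Bulbasaur", "Type 2": "Poison"},
    {"Name": "Charmander", "Type 2": None},
    {"Name": "Charmeleon", "Type 2": None},
    {"Name": "Charizard", "Type 2": "Flying"},
    {"Name": "Squirtle", "Type 2": None},
]

print(missing_by_column(rows))
```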

Sample

First rows

     #    Name                Type 1   Type 2  HP   Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0    1    Bulbasaur           Grass    Poison  45   49      49       65       65       45     1           False
1    2    Ivysaur             Grass    Poison  60   62      63       80       80       60     1           False
2    3    Venusaur            Grass    Poison  80   82      83       100      100      80     1           False
3    4    Mega Venusaur       Grass    Poison  80   100     123      122      120      80     1           False
4    5    Charmander          Fire     NaN     39   52      43       60       50       65     1           False
5    6    Charmeleon          Fire     NaN     58   64      58       80       65       80     1           False
6    7    Charizard           Fire     Flying  78   84      78       109      85       100    1           False
7    8    Mega Charizard X    Fire     Dragon  78   130     111      130      85       100    1           False
8    9    Mega Charizard Y    Fire     Flying  78   104     78       159      115      100    1           False
9    10   Squirtle            Water    NaN     44   48      65       50       64       43     1           False

Last rows

     #    Name                Type 1   Type 2  HP   Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
790  791  Noibat              Flying   Dragon  40   30      35       45       40       55     6           False
791  792  Noivern             Flying   Dragon  85   70      80       97       80       123    6           False
792  793  Xerneas             Fairy    NaN     126  131     95       131      98       99     6           True
793  794  Yveltal             Dark     Flying  126  131     95       131      98       99     6           True
794  795  Zygarde Half Forme  Dragon   Ground  108  100     121      81       95       95     6           True
795  796  Diancie             Rock     Fairy   50   100     150      100      150      50     6           True
796  797  Mega Diancie        Rock     Fairy   50   160     110      160      110      110    6           True
797  798  Hoopa Confined      Psychic  Ghost   80   110     60       150      130      70     6           True
798  799  Hoopa Unbound       Psychic  Dark    80   160     60       170      130      80     6           True
799  800  Volcanion           Fire     Water   80   110     120      130      90       70     6           True